Flexible and Robust Method for Missing Loop Detector Data Imputation
Authors: K. C. Henrickson, Y. Zou, and Y. Wang
Abstract
This work is primarily focused on missing traffic sensor data imputation for the purpose of improving the coverage and accuracy of traffic analysis and performance estimation. Missing data, whether attributable to hardware failure or error detection and removal, is a constant problem in loop and other traffic detector datasets. As the rate of missingness increases, the treatment of missing values quickly becomes the controlling factor in overall data quality. Previously, a number of imputation approaches have been developed for traffic data. However, few studies aim at handling traffic data with large blocks of missing values for network-wide implementation. A proven predictive mean matching multiple imputation method is introduced and applied to loop detector volume data collected on Interstate 5 in Washington State. Using the iterative multiple imputation by chained equations approach, the spatial correlation between nearby detectors is considered for prediction, and the presence of missing data in all predictors is effectively dealt with. The proposed methodology is shown to perform well on a range of missing data patterns, including missing completely at random, missing days, and missing months. After applying the imputation method to 20-second data and performing post-imputation aggregation, the results in this study suggest that the proposed method can outperform elementary pairwise regression and produce reliable imputation estimates, even when entire days and months are missing from the dataset. Thus, the predictive mean matching multiple imputation method can be used as an effective approach for imputing missing traffic data in a range of challenging scenarios.
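The abstract names two ingredients: predictive mean matching (PMM) and chained equations across nearby detectors. The following is a minimal sketch of how the two fit together, not the authors' implementation. It assumes a matrix with one column per detector and one row per time interval, with NaN marking missing volumes and at least some observed values in every column. The function names `pmm_impute` and `chained_pmm` and the parameters `k` and `n_iter` are hypothetical. For brevity the regression uses a least-squares point estimate and produces a single completed dataset, whereas a full multiple imputation would draw the coefficients from their posterior and repeat the procedure several times.

```python
import numpy as np

def pmm_impute(y, X, k=5, rng=None):
    """Fill NaNs in y by predictive mean matching on fully observed predictors X."""
    rng = np.random.default_rng(rng)
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    obs = ~np.isnan(y)
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    # Simplification: point estimate of the coefficients (no posterior draw).
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    pred = A @ beta
    out = y.copy()
    obs_idx = np.flatnonzero(obs)
    for i in np.flatnonzero(~obs):
        # Donors: the k observed rows whose predicted value is closest
        # to the prediction for the missing row.
        dist = np.abs(pred[obs_idx] - pred[i])
        donors = obs_idx[np.argsort(dist)[:k]]
        # Impute with the *observed* value of a randomly chosen donor.
        out[i] = y[rng.choice(donors)]
    return out

def chained_pmm(data, n_iter=5, k=5, seed=0):
    """Cycle over detector columns, imputing each from the others (chained equations)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Start from column means so every predictor is complete.
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    for _ in range(n_iter):
        for j in range(data.shape[1]):
            if not missing[:, j].any():
                continue
            y = filled[:, j].copy()
            y[missing[:, j]] = np.nan          # re-mask this detector's gaps
            X = np.delete(filled, j, axis=1)   # neighbours, currently filled in
            filled[:, j] = pmm_impute(y, X, k=k, rng=rng)
    return filled
```

Because PMM copies observed donor values instead of inserting raw regression predictions, every imputed volume is a value that the detector actually recorded at some other time.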
Publication date: 2014